Parallel Retrieval of Dense Vectors in the Vector Space Model

Authors

  • Tobias Berka
  • Marián Vajteršic

Abstract

Modern information retrieval systems use distributed and parallel algorithms to meet their operational requirements, and commonly operate on sparse vectors. Dimensionality-reducing techniques, however, produce dense and relatively short feature vectors. Motivated by the relevance of dense vectors, we have parallelized the vector space model for dense matrices and vectors. Our algorithm uses a hybrid partitioning that splits both documents and features, and operates on a mesh of hosts holding a block-partitioned corpus matrix. We show that the theoretical speed-up is optimal. The empirical evaluation of an MPI-based implementation reveals that we obtain a super-linear speed-up on a cluster using Nehalem Xeon CPUs. A version of this report has been published as “Tobias Berka and Marian Vajteršic: Parallel Retrieval of Dense Vectors in the Vector Space Model. Computing and Informatics (CAI), 2, 2011.”
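
The hybrid partitioning described in the abstract can be illustrated with a small serial simulation. This is only a sketch under stated assumptions, not the authors' MPI implementation: the names `split_ranges`, `mesh_scores` and `top_k` are hypothetical, scoring is assumed to be a plain inner product between each document vector and the query, and the sum over feature chunks stands in for whatever reduction the real system performs along a mesh row.

```python
from typing import List

Matrix = List[List[float]]


def split_ranges(n: int, parts: int) -> List[range]:
    """Split range(n) into `parts` contiguous, nearly equal chunks."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def mesh_scores(corpus: Matrix, query: List[float],
                rows: int, cols: int) -> List[float]:
    """Score all documents on a simulated rows x cols mesh of hosts.

    Assumption: host (i, j) holds the block made of document chunk i
    and feature chunk j, plus the matching slice of the query.
    """
    doc_chunks = split_ranges(len(corpus), rows)
    feat_chunks = split_ranges(len(query), cols)
    scores = [0.0] * len(corpus)
    for docs in doc_chunks:          # one mesh row per document chunk
        for feats in feat_chunks:    # one mesh column per feature chunk
            for d in docs:
                # Local work of one host: partial dot product on its block.
                partial = sum(corpus[d][f] * query[f] for f in feats)
                # Stand-in for a sum-reduction along the mesh row.
                scores[d] += partial
    return scores


def top_k(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k best-scoring documents."""
    return sorted(range(len(scores)), key=lambda d: (-scores[d], d))[:k]


# Toy dense corpus: 4 documents, 4 features.
corpus = [
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 3.0, 1.0, 0.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
query = [1.0, 0.0, 1.0, 2.0]

scores = mesh_scores(corpus, query, rows=2, cols=2)
serial = [sum(a * b for a, b in zip(doc, query)) for doc in corpus]
print(scores == serial, top_k(scores, 2))
```

On this toy input the block-wise result matches the serial matrix-vector product, which is the property any such partitioning has to preserve. If the document and query vectors are normalized to unit length beforehand, the same inner products are cosine similarities.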


Similar resources

Second dual space of little $\alpha$-Lipschitz vector-valued operator algebras

Let $(X,d)$ be an infinite compact metric space, let $(B,\|\cdot\|)$ be a unital Banach space, and take $\alpha \in (0,1)$. In this work, we first define the big and little $\alpha$-Lipschitz vector-valued ($B$-valued) operator algebras, and consider the little $\alpha$-Lipschitz $B$-valued operator algebra $\mathrm{lip}_{\alpha}(X,B)$. Then we characterize its second dual space.


Vector Space semi-Cayley Graphs

The original aim of this paper is to construct a graph associated with a vector space. Inspired by the classical definition of the Cayley graph of a group, we define the Cayley graph of a vector space. The vector space Cayley graph ${\rm Cay}(\mathcal{V},S)$ is a graph whose vertex set consists of all vectors of the vector space $\mathcal{V}$, and two vectors $v_1,v_2$ are joined by an edge whenever...


Hopf Hypersurfaces of Sasakian Space Form with Parallel Ricci Operator

Esmaiel Abedi, Mohammad Ilmakchi (Department of Mathematics, Azarbaijan Shahid Madani University, Tabriz, Iran)

Let $M^{2n}$ be a Hopf hypersurface in a Sasakian space form, with parallel Ricci operator and tangent to the structure vector field. First, we present the structures and properties of hypersurfaces and Hopf hypersurfaces in a Sasakian space form. Then we study the structure of hypersurfaces and Hopf hypersurfaces with a parallel Ricci tensor and show that there are two cases. In the first case, th...


Dimensions of Meaning

The representation of documents and queries as vectors in a high-dimensional space is well-established in information retrieval [1]. This paper proposes to represent the semantics of words and contexts in a text as vectors. The dimensions of the space are words, and the initial vectors are determined by the words occurring close to the entity to be represented, which implies that the space has sev...


Analysis of Vector Space Model in Information Retrieval

Information retrieval is the technology behind web search services. In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The vector space model is one of the classical and widely applied retrieval models for evaluating the relevance of web pages. The retrieval operation consists of computing the cosine similarity function between a g...



Journal title:
  • Computing and Informatics

Volume 30

Publication date 2011